A knowledge graph (KG) represents data as entities and the structural relations between them. This representation can support complex applications such as recommendation systems and question answering. In this study, a set of candidate drugs for COVID-19 is proposed using the Drug Repurposing Knowledge Graph (DRKG), a biological knowledge graph constructed from a vast amount of open-source biomedical knowledge to capture the mechanisms of compounds and their related biological functions. Node and relation embeddings are learned using knowledge graph embedding models as well as neural-network- and attention-based models; different models yield different node embeddings by changing the training objective. These embeddings are then used to predict whether a candidate drug is effective in treating a disease, or how likely a drug is to bind to a protein associated with a disease, both of which can be modelled as link prediction between two nodes. RESCAL performed best on the test dataset in terms of MR, MRR, and Hits@3.
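The RESCAL model mentioned above scores a (head, relation, tail) triple with a bilinear form, f(h, r, t) = hᵀ M_r t, where M_r is a relation-specific matrix. A minimal sketch of how such scores could rank candidate drugs for a "treats" relation follows; the embeddings and relation matrix here are random toy values, not the paper's learned DRKG embeddings.

```python
import numpy as np

def rescal_score(h, M_r, t):
    """RESCAL bilinear score for a (head, relation, tail) triple:
    f(h, r, t) = h^T M_r t, with a relation-specific matrix M_r."""
    return float(h @ M_r @ t)

def rank_candidate_drugs(disease_emb, M_treat, drug_embs):
    """Rank candidate drugs for one disease by their treatment-link score."""
    scores = np.array([rescal_score(d, M_treat, disease_emb) for d in drug_embs])
    return np.argsort(-scores)  # drug indices, highest-scoring first

# Toy setup: 5 hypothetical drugs, one disease, one "treats" relation matrix.
rng = np.random.default_rng(0)
dim = 8
drugs = rng.normal(size=(5, dim))
disease = rng.normal(size=dim)
M_treat = rng.normal(size=(dim, dim))
ranking = rank_candidate_drugs(disease, M_treat, drugs)
```

In a real pipeline the embeddings and M_treat would come from training on DRKG triples, and the ranking would be read as a prioritized list of repurposing candidates.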
AI-powered medical imaging has recently attracted considerable attention for its ability to deliver fast-paced healthcare diagnoses. However, it usually suffers from a lack of high-quality datasets due to high annotation cost, inter-observer variability, human annotator error, and errors in computer-generated labels. Deep learning models trained on noisily labelled datasets are sensitive to the noise type and generalize poorly to unseen samples. To address this challenge, we propose a Robust Stochastic Knowledge Distillation (RoS-KD) framework that mimics the notion of learning a topic from multiple sources to deter the learning of noisy information. More specifically, RoS-KD learns a smooth, well-informed, and robust student manifold by distilling knowledge from multiple teachers trained on overlapping subsets of the training data. Our extensive experiments on popular medical imaging classification tasks (cardiopulmonary disease and lesion classification) using real-world datasets show the performance benefit of RoS-KD, its ability to distill knowledge from several popular large networks (ResNet-50, DenseNet-121, MobileNet-V2) into a comparatively small network, and its robustness to adversarial attacks (PGD, FGSM). In particular, RoS-KD achieves >2% and >4% improvements in F1-score on the lesion classification and cardiopulmonary disease classification tasks, respectively, when the underlying student is ResNet-18, against a recent competitive knowledge distillation baseline. Additionally, on the cardiopulmonary disease classification task, RoS-KD outperforms most of the SOTA baselines with a ~1% gain in AUC score.
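Multi-teacher distillation of the kind the abstract describes is commonly implemented as a cross-entropy term on hard labels plus an averaged KL term toward each teacher's temperature-softened outputs. The sketch below illustrates that generic objective; it is not the exact RoS-KD loss, and the temperature and mixing weight are illustrative assumptions.

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels, T=2.0, alpha=0.5):
    """Generic multi-teacher distillation objective (an illustrative stand-in for
    RoS-KD): hard-label cross-entropy plus the mean KL divergence from each
    teacher's softened distribution to the student's."""
    n = len(labels)
    ce = -np.mean(np.log(softmax(student_logits)[np.arange(n), labels] + 1e-12))
    p_s = softmax(student_logits, T)
    kd = 0.0
    for t_logits in teacher_logits_list:
        p_t = softmax(t_logits, T)
        kd += np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1))
    kd /= len(teacher_logits_list)
    return alpha * ce + (1 - alpha) * (T ** 2) * kd
```

Training the teachers on overlapping data subsets, as in RoS-KD, makes the averaged teacher signal less correlated with any single noisy labelling.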
Writing reports by analyzing medical images is error-prone for inexperienced practitioners and experienced physicians alike. In this work, we present RepsNet, which adapts pre-trained vision and language models to interpret medical images and generate automated reports in natural language. RepsNet consists of an encoder-decoder model: the encoder aligns images with natural language descriptions via contrastive learning, while the decoder predicts answers by conditioning on the encoded image and the prior context of descriptions retrieved by nearest-neighbour search. We frame the problem in a visual question answering setting to handle both categorical and descriptive natural language answers. We conduct experiments on two challenging tasks over radiology image datasets: medical visual question answering (VQA-RAD) and report generation (IU-Xray). Results show that RepsNet outperforms state-of-the-art methods, with 81.08% classification accuracy on VQA-RAD 2018 and a 0.58 BLEU-1 score on IU-Xray. Supplementary details are available at https://sites.google.com/view/repsnet
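Contrastive image-text alignment of the kind used by the encoder is typically trained with a symmetric InfoNCE-style loss over a batch, where matched image-description pairs sit on the diagonal of a similarity matrix. The sketch below shows that generic loss (not RepsNet's exact objective); the temperature value is an assumption.

```python
import numpy as np

def info_nce(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive (InfoNCE-style) alignment loss over a batch of
    image/text embedding pairs; pair i is (image_embs[i], text_embs[i])."""
    # L2-normalize, then form cosine-similarity logits
    im = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    tx = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = im @ tx.T / temperature
    n = len(logits)
    log_p_i2t = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_t2i = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    diag = np.arange(n)
    # matched pairs are on the diagonal; maximize their log-probability both ways
    return -0.5 * (np.mean(log_p_i2t[diag, diag]) + np.mean(log_p_t2i[diag, diag]))
```

Minimizing this pulls each image toward its own description and away from the other descriptions in the batch, which is what lets the decoder later retrieve relevant prior reports by nearest-neighbour search in the shared space.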
The neural boom that has galvanized natural language processing (NLP) research over the past decade has likewise led to significant innovation in data-to-text generation (DTG). This survey offers a consolidated view of the neural DTG paradigm, with a structured examination of methods, benchmark datasets, and evaluation protocols. The survey draws out the boundaries separating DTG from the rest of the natural language generation (NLG) landscape, provides an up-to-date synthesis of the literature, and highlights the stages of technological adoption within and beyond the broader NLG umbrella. With this holistic view, we highlight promising avenues for DTG research that focus not only on the design of linguistically capable systems but also on systems exhibiting fairness and accountability.
This paper describes a large global dataset of people's discourse and reactions to the COVID-19 pandemic on the Twitter platform. From 28 January 2020 to 1 June 2022, we collected and processed Twitter posts from more than 29 million unique users, using four keywords: "corona", "wuhan", "ncov", and "covid". Leveraging probabilistic topic modelling and pre-trained machine-learning-based emotion recognition algorithms, we labelled each tweet with seventeen attributes, including (a) ten binary attributes indicating the tweet's relevance (1) or irrelevance (0) to each of the top ten detected topics, (b) five quantitative emotion attributes denoting the degree of valence or sentiment intensity (from 0: extremely negative to 1: extremely positive) and the intensity of the emotions of fear, anger, sadness, and happiness (from 0: not at all to 1: extremely intense), and (c) two categorical attributes indicating the sentiment (very negative; negative; neutral or mixed; positive; very positive) and the dominant emotion (fear, anger, sadness, happiness, or no specific emotion) the tweet primarily expressed. We discuss the technical validity and report descriptive statistics of these attributes, their temporal distributions, and their geographic representation. The paper concludes with a discussion of the dataset's usage in communication, psychology, public health, economics, and epidemiology.
Quadruped robots are currently used in industrial robotics as mechanical aids to automate several routine tasks. However, the use of such robots in a domestic setting is still very much an open research topic. This paper discusses the design and virtual simulation of such a robot, capable of detecting and understanding human emotions, generating its gait, and responding via sounds and on-screen expressions. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate various terrains, detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish a framework for simulating a quadruped robot that is emotionally intelligent and can primarily respond to audio-visual stimuli with motor or audio responses. Emotion detection from speech was not as performant as ERANNs or Zeta Policy learning, still managing an accuracy of 63.5%. The video emotion detection system produced results almost on par with the state of the art, with an accuracy of 99.66%. Due to its "on-policy" learning process, the PPO algorithm learned extremely rapidly, allowing the simulated dog to demonstrate a remarkably seamless gait across the different cadences and variations. This enabled the quadruped robot to respond to the generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.
Searching long egocentric videos with natural language queries (NLQ) has compelling applications in augmented reality and robotics, where a fluid index into everything that a person (agent) has seen before could augment human memory and surface relevant information on demand. However, the structured nature of the learning problem (free-form text query inputs, localized video temporal window outputs) and its needle-in-a-haystack nature makes it both technically challenging and expensive to supervise. We introduce Narrations-as-Queries (NaQ), a data augmentation strategy that transforms standard video-text narrations into training data for a video query localization model. Validating our idea on the Ego4D benchmark, we find it has tremendous impact in practice. NaQ improves multiple top models by substantial margins (even doubling their accuracy), and yields the very best results to date on the Ego4D NLQ challenge, soundly outperforming all challenge winners in the CVPR and ECCV 2022 competitions and topping the current public leaderboard. Beyond achieving the state-of-the-art for NLQ, we also demonstrate unique properties of our approach such as gains on long-tail object queries, and the ability to perform zero-shot and few-shot NLQ.
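The core transformation in Narrations-as-Queries is simple to state: each timestamped narration becomes a (query text, temporal window) training pair for the localization model. The sketch below illustrates that idea under an assumed fixed window size centred on the narration timestamp; the real method's window construction may differ.

```python
def narrations_to_queries(narrations, window=2.0):
    """Sketch of the Narrations-as-Queries idea: turn timestamped video-text
    narrations into NLQ-style (query, temporal window) training pairs.
    narrations: list of (timestamp_seconds, narration_text) tuples.
    window: assumed window length in seconds, centred on the timestamp."""
    return [
        (text, (max(0.0, t - window / 2), t + window / 2))
        for t, text in narrations
    ]
```

The appeal of the approach is that narrations are cheap and plentiful relative to hand-annotated query/window pairs, so this conversion massively expands the supervision available to the query localization model.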
A Machine Translation (MT) system generally aims at automatically rendering a source language into a target language while retaining the original context, using various Natural Language Processing (NLP) techniques. Among these methods, Statistical Machine Translation (SMT) uses probabilistic and statistical techniques to analyze information and perform the conversion. This paper canvasses the development of bilingual SMT models for translating English to fifteen low-resource Indian Languages (ILs) and vice versa. At the outset, all 15 languages are briefed with a short description related to our experimental need. Further, a detailed analysis of the Samanantar and OPUS datasets for model building, along with the standard benchmark dataset (Flores-200) for fine-tuning and testing, is carried out as part of our experiment. Different preprocessing approaches are proposed in this paper to handle the noise in the dataset. To create the system, the MOSES open-source SMT toolkit is explored. Distance reordering is utilized with the aim of understanding the rules of grammar and context-dependent adjustments through a phrase reordering categorization framework. In our experiment, the quality of the translation is evaluated using standard metrics such as BLEU, METEOR, and RIBES.
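As background (a standard formulation, not something this abstract spells out), phrase-based SMT systems such as MOSES decode under a log-linear model: the best translation $\hat{e}$ of a source sentence $f$ maximizes a weighted sum of feature functions $h_m$ (phrase translation model, language model, and distortion/reordering penalties of the kind distance reordering tunes) with weights $\lambda_m$:

```latex
\hat{e} = \arg\max_{e} P(e \mid f)
        = \arg\max_{e} \sum_{m=1}^{M} \lambda_m \, h_m(e, f)
```

The distance-based reordering mentioned above enters this sum as one of the $h_m$, penalizing phrase translations that jump far from the source order.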
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.
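The attention mechanism the STCA abstract describes, which "fully captures texture information from discriminative time steps during a growing season", amounts to attention pooling over a time series of per-step features. The sketch below shows that generic operation, not the paper's exact architecture; the scoring vector stands in for learned attention parameters.

```python
import numpy as np

def temporal_attention_pool(features, w):
    """Attention pooling over time steps, a generic sketch of the idea behind a
    spatiotemporal classifier with attention.
    features: (T, D) array of per-time-step features across a growing season.
    w: (D,) scoring vector (stands in for learned attention parameters)."""
    scores = features @ w                       # (T,) relevance of each time step
    alphas = np.exp(scores - scores.max())
    alphas /= alphas.sum()                      # softmax attention weights over time
    pooled = features.T @ alphas                # (D,) weighted season representation
    return alphas, pooled
```

Time steps whose features score highly (e.g. months when cashew canopy texture is most distinctive) dominate the pooled representation fed to the classifier.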